Make life simple: unleash the full power of the parallel tempering algorithm 
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We introduce a new update scheme to systematically improve the efficiency of parallel tempering 
simulations. We show that by adapting the number of sweeps between replica exchanges to the 
canonical autocorrelation time, the average round-trip time of a replica in temperature space can 
be significantly decreased. The temperatures are not dynamically adjusted as in previous attempts 
but chosen to yield a 50% exchange rate of adjacent replicas. We illustrate the new algorithm with 
results for the Ising model in two and the Edwards- Anderson Ising spin glass in three dimensions. 
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The parallel tempering (PT), or replica exchange, sim- 
ulation technique 1, 2, 3, 4J provides an efficient method 
to investigate systems with rugged free-energy land- 
scapes 0, particularly at low temperatures. Initially, 
applications of the method were limited to problems in 
statistical physics. By now, however, PT and its exten- 
sions are used in many disciplines, e.g. biomolecules [6, 
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where A/3 = P2 — Pi is the difference between the in- 
verse temperatures of the two replicas and AE = E2~Ei 
their energy difference. The acceptance probability is the 
smaller the larger the temperature difference or the sys- 
tem size gets. For PT simulations to be most efficient, 
each replica should spend the same amount of time at 
each temperature. To this end, several strategies have 
been proposed in the last years [m [11, [H, [13, [H, [li, [13] , 
but an efficient selection of optimal temperature inter- 
vals is still an open problem. In the physically appealing 
protocol proposed by Katzgraber et al. [2^ the optimal 
temperatures are determined from the flow in tempera- 
ture space, that is, the rate of round trips between low 
and high temperatures is maximized by systematically 
re-adjusting the temperatures. 

Unfortunately, their initial recursion is rather complex 
and needs a significant amount of CPU time. Therefore, 
we do not use the idea of maximum flow and rather em- 
ploy the concept of a constant acceptance rate between 
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[T3 |. bioinformatics [13(, zeolite struc- 
, classical and quantum frustrated spin 
spin glasses 0, i, [13, [ll, [3, ^ and 
I23]. The use of PT in interdisciplinary 
fields spanning physics, chemistry, biology, engineering 
and material sciences rapidly increases. 

In a PT simulation, one generates many replicas of 
Monte Carlo (MC) Markov chains or molecular dynam- 
ics (MD) trajectories at different temperatures in paral- 
lel. At regular intervals an attempt is made to exchange 
the configurations of different, usually adjacent replicas, 
which is accepted with probability 
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where Pp.{E.i) is the probability for replica i at (3i to 
have the energy Ei (the subscript is the replica index). 
Using this formula we can calculate, starting from 
a set of inverse temperatures (3i which satisfy A{i — > 
i + 1) = const.. For systems with a diverging specific 
heat one obtains a high density of replicas around the 
critical temperature, i.e., the difference between the in- 
verse temperature of two adjacent replicas is small. For 
high values of /?, i.e. low temperatures, the difference 
between energy distributions at different temperatures 
becomes small and therefore A/3 increases. Furthermore 
for small /3 values, AE decreases and the spacing between 
the replicas grows. 

As an illustration, we shall first consider MC simu- 
lations of the two-dimensional (2D) Ising model where 
the density of states and hence ^ can be calculated ex- 
3l|. For all reasonably chosen rates ^(1 2), 



the replica flow from high to low temperatures and vice 
versa turns out to be very slow, at least when a local 
update scheme, e.g. the Metropolis algorithm, is used 
for each of the replicas. The replica flow through the 
temperature space shows a significant drop around the 
critical temperature. In Fig. [1] we show as an example 
for an acceptance rate of 50% the fraction of replicas 
which have visited most recently the smallest /? value 
and wander "up" in the inverse temperature space. This 
sharp drop-off behavior led Katzgraber et al. [28[ to their 
feedback-optimized update scheme (FBO-PT), in which 
they re-adjust the temperatures by analyzing the local 
diffusivity. 

We, on the other hand, want to remove the unwanted 
behavior at (3c, while keeping the temperatures fixed at 
their initial values. Looking at the trajectory of an ar- 
bitrarily chosen replica in temperature space shown in 
the upper plot of Fig. [21 we see a clear block structure, 
where the border of the blocks coincides with the critical 
temperature. Such a block structure is related to a bot- 
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FIG. 1: Fraction of replicas which wander from the smallest 
Pi to the largest as a function of the replica index i for the 2D 
Ising model (L = 80). The simulations without optimization 
exhibit a sharp decline close to /3c, as one can see in the in- 
set. Taking the canonical correlation times Tcan into account 
(PTt), the fraction decreases, for the same set of tempera- 
tures, almost linearly. 
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FIG. 2; Time series of an arbitrarily chosen replica on its way 
through inverse temperature space of the 2D Ising model (L = 
80). The upper plot shows the result of a PT simulation and 
the lower that of a PTr simulation with Mocai = Tcan- The 
horizontal lines indicate /3c, the infinite- volume critical point. 
The blocks in the time series are a signal of the increasing 
autocorrelation times due to critical slowing down. 



tleneck in the flow through the temperature space, or, in 
other words, for a replica starting from a high tempera- 
ture it is hard to overcome this bottleneck and move to 
the low temperature region. A plausible explanation of 
this observation is as follows: toward the critical temper- 
ature the autocorrelation time increases due to critical 
slowing down and therefore two exchanged replicas stay 
in phase space close to each other. It is hence more likely 
that these two replicas exchange again. 

To verify this idea, we use a toy model based on the 
bivariate Gaussian process with < p < 1 [32i] . 
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where eo = Cq, and the are independent Gaussian ran- 
dom variables satisfying (e-) = and (e-e^) — (e-)^ — Sij. 
Iterating this recursion it follows that the autocorrela- 
tion function is A{k) = (eoefe) = = e~'=/'^™p , where 
Texp = — 1/lnp is the exponential autocorrelation time. 
It can be shown that with increasing Toxp the mean step 
size decreases, i.e., (|ei+i— e^j) = 2\J (1 — /9)/7r, such that 
the system moves slower through the one-dimensional 
phase space, and this is what we are interested in. 

Using the stochastic process ([3]) we are able to approx- 
imate for any realistic model the movement in energy 
space during a parallel tempering simulation. From the 
energy distribution of initial canonical simulations we ob- 
tain for each of the replica at /3i the mean and variance 
which, after a trivial shift and rescaling, can be repro- 
duced with ([3]) . Next we exploit the freedom in the model 
to adjust Toxp for each temperature which allows us to in- 
vestigate the dependence of the flow through temperature 
space on the autocorrelation times. In general, simula- 
tions near a second-order phase transition are affected by 
critical slowing down, i.e., an increasing autocorrelation 
time Tcan ~ if' ^ wherc ^ denotes the (spatial) correlation 
length and z is the dynamical critical exponent. To take 



this into account, we set Toxp to the canonical autocorre- 
lation time Tcan of the energy measured in the indepen- 
dent simulations. Together with the mean and variance 
this specifies the parameters of the replicated process ([3]) . 

By fitting to 2D Ising model MC data, our first find- 
ing comes from a comparison of the autocorrelation times 
for iterations of ([3]) with and without the PT routine. As 
expected, the autocorrelation times for the PT simula- 
tion are much smaller. The flow through temperature 
space looks similar as for the 2D Ising model depicted 
in Fig. [1] We also find a pronounced decline around the 
pseudo-critical point /Jc- The reason for this behavior is, 
as already anticipated above, the slowed down dynamics 
near fi^. That means, after two adjacent replicas in the 
vicinity of /3c have been exchanged, they will stay close to 
each other and changing them back to the original state is 
more likely than an exchange with another replica. If the 
dynamics is even slower (by simply tuning Toxp larger) a 
complete trapping can be observed and the replicas do 
not move from low to high temperatures at all. 

By systematically varying the inputted autocorrelation 
times, our toy model suggests that an easy way to cure 
this problem is to increase the number of local updates 
between the PT exchanges proportional to the autocorre- 
lation time of the initial (non PT) simulation for a given 
temperature. 

This general strategy will be now first tested for the 
2D Ising model. For system sizes up to L = 80 we 
use the exact energy distributions [3l| to calculate a 
set of inverse temperatures {/3i} with an acceptance rate 
A(i ^ z + 1) = 0.5 starting from (i\ — 0.38. To cover 
almost the same temperature interval for different sys- 
tem sizes L the number of replicas N has to increase 
with L [il] ■ For this set of inverse temperatures we per- 
form short independent Metropolis MC simulations to 
estimate the canonical autocorrelation times Tcan(/3) of 
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FIG. 3: PT moves per tunneling as a function of Mocai for 
the 2D Ising model {L = 80), approaching for A'locai ~ Tcan 
the unbiased random walk limit. The inset shows the actually 
needed computing time in units of the total run time of the 
standard PT simulation. 



FIG. 4: Autocorrelation times as function of inverse temper- 
ature for the canonical simulations, the standard PT update 
scheme, the feedback-optimized PT method (FBO-PT), and 
four different runs of our improved PT update scheme (PTr) 
for the 2D Ising model (L = 80). 



the energy, together with the mean and width of the en- 
ergy distribution. In the actual simulations we then use 
the usual PT update scheme with only one important 
modification, namely, we choose the number of sweeps 
Mocai(/3) between the attempts to exchange the config- 
urations proportional to Tcan(/3) (-^locai (/i^) = 1 for stan- 
dard PT and FBO-PT simulations). The larger the num- 
ber of sweeps between the exchange attempts, the smaller 
the correlation between adjacent replicas. Therefore, one 
has to find a compromise between accuracy and computer 
time, which can be easily achieved by using our toy model 
(which runs orders of magnitude faster than the actual 
simulations). To illustrate this we include in Fig. [3] a 
comparison for different choices of iViocai- 

The main plot of Fig. [3] shows the number of PT moves 
necessary for an arbitrarily chosen replica to move from 
the highest to the lowest temperature and back again. In 
the following such a round trip will be called tunneling. 
We clearly see that with increasing number of sweeps per 
replica the tunneling time converges to the value of an un- 
biased random walk (indicated by the arrow in the lower 
right corner) consisting of two legs of length (N — 1). 
The limit for one round trip is hence given by 2{N — 1)^. 
If we choose Afiocai(/3) = Tcan(/3), the correlation between 
adjacent replicas is negligible and each replica performs a 
random walk through temperature space (see lower plot 
in Fig. (2). Furthermore, the sweeps needed for a tun- 
neling event are close to the theoretical value, as is also 
reflected in the inset of Fig. [Tl where we show that the 
fraction of replicas moving "up" in the inverse temper- 
ature is an almost linear function of f3. This is a major 
difference to FBO-PT (28|], where the linear relation holds 
only for the fraction of replicas moving "up" as a function 
of the replica index. In the inset of Fig. [3] we compare 
the computational cost of our improved PT (denoted by 
FTr) with that for standard PT and FBO-PT, showing 
that for moderate values of Mocai the computational ef- 
fort is the same for both improved methods. To keep the 



comparison fair, we have excluded the additional com- 
puter time needed for FBO-PT to determine the set of 
inverse temperatures and for PTr to obtain the local au- 
tocorrelation times. If one increases Mocai, the ratio of 
tunnelings per CPU time decreases, i.e., above a certain 
threshold value of Mocai the computational effort of PT,- 
increases faster than the improvement of the tunneling 
speed. 

To compare our improved PT,- with other methods one 
should not only look at the computational cost but also 
at the accuracy that is achieved for the same amount of 
measurements. An easy way to check this is to measure 
the autocorrelation time t. In Fig. d] we show the auto- 
correlation times of the 2D Ising model with L — 80 for 
standard PT, FBO-PT, and our PT^ algorithm with dif- 
ferent choices of Mocai(/3). The improvement gained by 
using PT instead of simulating each temperature inde- 
pendently is almost one order of magnitude in the region 
around the critical point. If one rearranges the inverse 
temperatures according to the FBO-PT algorithm one 
finds even smaller autocorrelation times around /3c, but 
the improvement away from criticality is less pronounced 
than for the standard PT method. Taking in PT^ the 
local autocorrelation times Tcan(/3) into account we can 
decrease r systematically. For Mocai(/3) = T-can(/3)/64, 
where for all temperatures the autocorrelation times of 
the PT,- simulation are slightly smaller than for FBO- 
PT, the computational effort is almost equal for the two 
methods. If we use iViocai(/3) = Tcan(/3), then the autocor- 
relation times are smaller than unity for all temperatures 
and the resulting time series are nearly uncorrelated, but 
the computational costs are clearly too high to make this 
choice useful. 

We close with a brief remark on applications of our 
PTr algorithm to a MC study of the 3D Edwards- 
Anderson Ising spin-glass model on a i = B'^ lattice sim- 
ulated in a temperature range from 0.75 to 1.7 around 
Tc ~ 1.15. Using the same procedure as described above, 
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FIG. 5: Fraction of replicas which wander from the smallest 
Pi to the largest as a function of the replica index i for the 3D 
Ed wards- Anderson Ising spin-glass model {L = G, averaged 
over 20 disorder realizations). 



we find also here an improvement of the rephca flow from 
high to low temperatures, i.e., from the disordered to the 
spin-glass phase (see Fig. [5|). However, the additional 
computational effort to gain this improvement is signifi- 
cant due to the exponential increase of the autocorrela- 
tion time with decreasing temperature. Therefore, one 
has to carefully tune the balance between used computer 
time and quality of results. 

To summarize, we discovered a remarkable block build- 
ing structure in PT simulations, revealed the mechanism 
behind it and showed how to cure this problem by taking 
into account the temperature dependence of autocorre- 
lation times. This demonstrates how easily the quality 
of PT simulation data can be improved both in MC and 
MD studies. 
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